Mitigating Data Imbalance Problem in Transformer-Based Intent Detection

Authors

Abstract

There are two major problems when deploying a practical intent detection system for a new customer. First, domain-specific data from the customer could be limited and imbalanced. Additionally, although different customers might share the same domain, their intent categories might differ from each other. Thus, it is difficult to combine the datasets collected from different customers into a single larger one. In this paper, we use class weights in the loss computation to alleviate the imbalance problem. The class weights are defined to be inversely proportional to the class frequency in the training set in order to give more influence to less observed classes. We also employ a two-pass fine-tuning procedure to utilize the information in in-domain datasets. Experimental results show that performance is improved significantly when the weighted loss function is used together with the transfer learning procedure. The absolute improvement in percent accuracy is approximately 2% over the transformer-based baseline.
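The class-weighting idea in the abstract can be illustrated with a minimal sketch. The abstract only states that weights are inversely proportional to class frequency; the exact normalisation used here (`n_samples / (n_classes * class_count)`), the function names, and the toy intent labels are assumptions made for illustration, not the authors' implementation.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Return one weight per class, inversely proportional to its frequency.

    Assumed normalisation: n_samples / (n_classes * class_count), so that a
    perfectly balanced dataset gives every class a weight of 1.0.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of -log p(true class) over a batch.

    `probs` is a list of dicts mapping class name to predicted probability.
    The sum is divided by the total weight of the batch, so rare classes
    contribute more to the loss than frequent ones.
    """
    total = 0.0
    weight_sum = 0.0
    for p, y in zip(probs, labels):
        w = weights[y]
        total += -w * math.log(p[y])
        weight_sum += w
    return total / weight_sum


# Hypothetical imbalanced training set: 8 "balance" intents, 2 "transfer" intents.
train_labels = ["balance"] * 8 + ["transfer"] * 2
class_weights = inverse_frequency_weights(train_labels)

# A toy batch with one example of each class and uniform predictions.
batch_probs = [{"balance": 0.5, "transfer": 0.5}, {"balance": 0.5, "transfer": 0.5}]
batch_labels = ["balance", "transfer"]
loss = weighted_cross_entropy(batch_probs, batch_labels, class_weights)
```

In this toy setting the rare class receives four times the weight of the frequent class, so an error on a "transfer" example moves the loss four times as much as the same error on a "balance" example. In a transformer fine-tuning setup, the same per-class weights would typically be passed to the framework's cross-entropy loss rather than computed by hand.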


Similar resources

Class Imbalance Problem in Data Mining Review

In the last few years, there have been major changes and developments in the classification of data. As the application area of technology increases, the size of data also increases. Classification of data becomes difficult because of the unbounded size and imbalanced nature of data. The class imbalance problem has become the greatest issue in data mining. The imbalance problem occurs where one of the two classes havi...


Geometric Mean based Boosting Algorithm to Resolve Data Imbalance Problem

In classification or prediction tasks, the data imbalance problem is frequently observed when most samples belong to one majority class. The data imbalance problem has received a lot of attention in the machine learning community because it is one of the causes that degrade the performance of classifiers or predictors. In this paper, we propose a geometric mean based boosting algorithm (GMBoost) to resolv...


Data Imbalance Problem solving for SMOTE Based Oversampling: Study on Fault Detection Prediction Model in Semiconductor Manufacturing Process

Fault detection prediction for the FAB (wafer fabrication) process in semiconductor manufacturing can improve product quality and reliability, depending on the classification performance. However, faults occur only occasionally in the FAB process, and most outcomes are “pass”. Hence, data imbalance occurs between the pass/fail classes. If data imbalance occurs, prediction models a...


Class Imbalance Problem in Data Mining using Probabilistic Approach

The class imbalance problem arises when one class has far more examples than the other classes. Classical classifiers designed for balanced datasets cannot deal with the class imbalance problem because they pay more attention to the majority class. The main drawback associated with the majority class is the loss of important information. The class imbalance problem is difficult due to the amount ...


Alleviating the Class Imbalance problem in Data Mining

The class imbalance problem in two-class data sets is one of the most important problems. When examples of one class in a training data set vastly outnumber examples of the other class, standard machine learning algorithms tend to be overwhelmed by the majority class and ignore the minority class. There are several algorithms to alleviate the problem of class imbalance in literature. In this pa...



Journal

Journal title: European Journal of Science and Technology

Year: 2022

ISSN: 2148-2683

DOI: https://doi.org/10.31590/ejosat.1044812